Chloé Winters
STA 551
Fall 2025
Introduction
Methodology & Analysis
Results & Conclusion
General Discussion
Telecommunication customer service data
1,000 customer observations
14 variables
Three logistic models were evaluated:
Full model
Reduced model
Stepwise-selected model
Models were compared using ROC curves and AUC on a held-out test set.
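The AUC comparison can be sketched in a few lines. This is a minimal illustration rather than the analysis code behind these slides: the labels and predicted probabilities are made up, and `auc` is a hypothetical helper that computes AUC as the share of (churner, non-churner) pairs in which the churner gets the higher score.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = churned) and predicted probabilities
# from two hypothetical models; none of these values come from the study.
y_test = [1, 0, 1, 0, 0, 1, 0, 0]
p_model_a = [0.81, 0.30, 0.35, 0.22, 0.40, 0.77, 0.10, 0.52]
p_model_b = [0.60, 0.55, 0.40, 0.45, 0.50, 0.58, 0.35, 0.62]

print("Model A AUC:", round(auc(y_test, p_model_a), 4))
print("Model B AUC:", round(auc(y_test, p_model_b), 4))
```

The model with the larger AUC ranks churners above non-churners more often on the held-out data.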
Full logistic model

| Predictor | Odds Ratio | p-value |
|---|---|---|
| (Intercept) | 1.410 | 0.751 |
| SexMale | 0.941 | 0.777 |
| Marital_StatusSingle | 1.022 | 0.961 |
| Term | 0.927 | <0.001 |
| Phone_serviceYes | 1.005 | 0.994 |
| International_planYes | 1.106 | 0.700 |
| Voice_mail_planYes | 0.852 | 0.463 |
| Multiple_lineNo phone | NA | NA |
| Multiple_lineYes | 0.853 | 0.543 |
| Internet_serviceDSL | 0.588 | 0.261 |
| Internet_serviceFiber optic | 0.843 | 0.538 |
| Internet_serviceNo Internet | 0.145 | 0.038 |
| Technical_supportNo internet | NA | NA |
| Technical_supportYes | 0.665 | 0.142 |
| Streaming_VideosNo internet | NA | NA |
| Streaming_VideosYes | 1.365 | 0.369 |
| Agreement_periodOne year contract | 0.200 | <0.001 |
| Agreement_periodTwo year contract | 0.119 | <0.001 |
| Monthly_Charges | 1.003 | 0.857 |
| Total_Charges | 1.001 | 0.003 |
Reduced logistic model

| Predictor | Odds Ratio | p-value |
|---|---|---|
| (Intercept) | 1.630 | 0.061 |
| Term | 0.926 | <0.001 |
| Internet_serviceDSL | 0.556 | 0.065 |
| Internet_serviceFiber optic | 0.894 | 0.671 |
| Internet_serviceNo Internet | 0.129 | <0.001 |
| Agreement_periodOne year contract | 0.181 | <0.001 |
| Agreement_periodTwo year contract | 0.101 | <0.001 |
| Total_Charges | 1.001 | <0.001 |
Stepwise-selected logistic model

| Predictor | Odds Ratio | p-value |
|---|---|---|
| (Intercept) | 0.606 | 0.319 |
| Term | 0.927 | <0.001 |
| Technical_supportNo internet | 0.270 | 0.011 |
| Technical_supportYes | 0.637 | 0.080 |
| Agreement_periodOne year contract | 0.200 | <0.001 |
| Agreement_periodTwo year contract | 0.117 | <0.001 |
| Monthly_Charges | 1.012 | 0.079 |
| Total_Charges | 1.001 | 0.003 |
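An odds ratio is the exponential of a logistic regression coefficient, so the table above can be converted back to the log-odds scale. The sketch below uses the odds ratios from that table. The helper names are illustrative, and because the odds ratios are rounded to three decimals, the recovered coefficients are approximate.

```python
import math

# Odds ratios copied from the table above (rounded to three decimals there)
ODDS_RATIOS = {
    "Term": 0.927,
    "Technical_supportNo internet": 0.270,
    "Technical_supportYes": 0.637,
    "Agreement_periodOne year contract": 0.200,
    "Agreement_periodTwo year contract": 0.117,
    "Monthly_Charges": 1.012,
    "Total_Charges": 1.001,
}


def log_odds_coefficient(odds_ratio):
    """Coefficient on the log-odds scale: beta = ln(odds ratio)."""
    return math.log(odds_ratio)


def percent_change_in_odds(odds_ratio):
    """Percent change in the odds of churn for a one-unit increase."""
    return (odds_ratio - 1.0) * 100.0


for name, ratio in ODDS_RATIOS.items():
    beta = log_odds_coefficient(ratio)
    change = percent_change_in_odds(ratio)
    print(f"{name}: beta = {beta:+.4f}, odds change = {change:+.1f}%")
```

For example, the Term odds ratio of 0.927 corresponds to roughly a 7.3% drop in the odds of churn for each one-unit increase in Term, holding the other predictors fixed.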
Full perceptron

| Metric | Value |
|---|---|
| error | 0.0126 |
| reached.threshold | 0.0095 |
| steps | 100.0000 |
| Intercept.to.1layhid1 | -1.5170 |
| SexFemale.to.1layhid1 | -0.5295 |
| SexMale.to.1layhid1 | -0.6308 |
| Marital_StatusSingle.to.1layhid1 | -0.2157 |
| Phone_serviceYes.to.1layhid1 | -1.9458 |
| International_planYes.to.1layhid1 | -0.3901 |
| Voice_mail_planYes.to.1layhid1 | -0.2306 |
| Multiple_lineNo.phone..to.1layhid1 | -2.2922 |
| Multiple_lineYes.to.1layhid1 | -0.0478 |
| Internet_serviceDSL.to.1layhid1 | -0.4440 |
| Internet_serviceFiber.optic.to.1layhid1 | -0.3280 |
| Internet_serviceNo.Internet.to.1layhid1 | -0.8579 |
| Technical_supportNo.internet..to.1layhid1 | -0.6097 |
| Technical_supportYes.to.1layhid1 | -0.1890 |
| Streaming_VideosNo.internet..to.1layhid1 | 0.6980 |
| Streaming_VideosYes.to.1layhid1 | -0.0192 |
| Agreement_periodOne.year.contract.to.1layhid1 | -0.5605 |
| Agreement_periodTwo.year.contract.to.1layhid1 | -0.4707 |
| ChurnYes.to.1layhid1 | 10.8148 |
| Term_scale.to.1layhid1 | -1.3392 |
| Monthly_Charges_scale.to.1layhid1 | 0.5239 |
| Total_Charges_scale.to.1layhid1 | 0.0469 |
| Intercept.to.Churn_num | -5.2526 |
| 1layhid1.to.Churn_num | 10.1725 |
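One row in the table above, `ChurnYes.to.1layhid1`, carries by far the largest input weight (10.8148). That suggests a dummy column derived from the response may have been passed to the network as an input, which would let it reproduce the outcome almost exactly. The check below is a hypothetical sketch, not part of the original analysis: `find_leaky_inputs` and the shortened column list are illustrative, and matching on the response name is only a rough heuristic.

```python
def find_leaky_inputs(input_names, response):
    """Return input columns whose name starts with the response name."""
    prefix = response.lower()
    return [name for name in input_names if name.lower().startswith(prefix)]


# Shortened, illustrative list of input columns; the names follow the table above
inputs = [
    "SexMale",
    "Term_scale",
    "ChurnYes",
    "Monthly_Charges_scale",
    "Total_Charges_scale",
]

leaks = find_leaky_inputs(inputs, "Churn")
print(leaks)  # ['ChurnYes']
```

Any column flagged this way would need to be dropped from the inputs before the model's fit statistics could be trusted.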
Reduced perceptron

| Metric | Value |
|---|---|
| error | 49.5837 |
| reached.threshold | 0.0086 |
| steps | 2002.0000 |
| Intercept.to.1layhid1 | -0.2363 |
| Term_scale.to.1layhid1 | 5.9524 |
| Total_Charges_scale.to.1layhid1 | -7.0861 |
| Agreement_periodOne.year.contract.to.1layhid1 | 1.2362 |
| Agreement_periodTwo.year.contract.to.1layhid1 | 2.5294 |
| Intercept.to.Churn_num | 3.1122 |
| 1layhid1.to.Churn_num | -6.8599 |
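The weights above describe a network with one hidden unit, so a prediction is two nested function evaluations. The sketch below plugs those weights into a forward pass. It assumes a logistic activation on both the hidden unit and the output, which the slides do not confirm, and it takes inputs that are already scaled, since the scaling method is not stated. The customer values and the shortened names `OneYear` and `TwoYear` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Weights copied from the table above
HIDDEN_BIAS = -0.2363
HIDDEN_WEIGHTS = {
    "Term_scale": 5.9524,
    "Total_Charges_scale": -7.0861,
    "OneYear": 1.2362,   # Agreement_periodOne.year.contract
    "TwoYear": 2.5294,   # Agreement_periodTwo.year.contract
}
OUTPUT_BIAS = 3.1122
OUTPUT_WEIGHT = -6.8599


def predict_churn(inputs, logistic_output=True):
    """Forward pass; missing inputs default to 0 (the reference level)."""
    z = HIDDEN_BIAS + sum(
        weight * inputs.get(name, 0.0) for name, weight in HIDDEN_WEIGHTS.items()
    )
    hidden = sigmoid(z)
    out = OUTPUT_BIAS + OUTPUT_WEIGHT * hidden
    # Assumption: the output unit is also logistic; pass False for a linear output
    return sigmoid(out) if logistic_output else out


# Hypothetical customers with scaled inputs set to 0
month_to_month = {"Term_scale": 0.0, "Total_Charges_scale": 0.0}
two_year = {"Term_scale": 0.0, "Total_Charges_scale": 0.0, "TwoYear": 1.0}

p_month_to_month = predict_churn(month_to_month)
p_two_year = predict_churn(two_year)
print(round(p_month_to_month, 3), round(p_two_year, 3))
```

Under these assumptions the two-year contract lowers the predicted churn score, which matches the direction of the contract effects in the logistic models.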
| Model | AUC |
|---|---|
| Full Logistic | 0.8042817 |
| Reduced Logistic | 0.7915127 |
| Stepwise Logistic | 0.8081395 |
| Model | AUC |
|---|---|
| Full Perceptron | 1.0000000 |
| Reduced Perceptron | 0.7410074 |
The final model is the stepwise logistic regression
Chosen for the highest AUC among the logistic models (0.808) and greater stability
Its predictors overlap substantially with those of the other models
\[ \begin{aligned} \log\frac{P(\text{Churn})}{1 - P(\text{Churn})} = {} & -0.5002 - 0.0758(\text{Term}) - 1.309(\text{TS_NoInternet}) - 0.451(\text{TS_Yes}) \\ & - 1.612(\text{OneYear}) - 2.146(\text{TwoYear}) \\ & + 0.01183(\text{MonthlyCharges}) + 0.000642(\text{TotalCharges}) \end{aligned} \]
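To turn the fitted equation into a churn probability, the right-hand side is treated as the log-odds and passed through the logistic function. The sketch below uses the coefficients from the equation above. The customer values are hypothetical, and `churn_probability` is an illustrative helper rather than code from the analysis.

```python
import math

# Coefficients copied from the fitted equation above
COEF = {
    "Intercept": -0.5002,
    "Term": -0.0758,
    "TS_NoInternet": -1.309,
    "TS_Yes": -0.451,
    "OneYear": -1.612,
    "TwoYear": -2.146,
    "MonthlyCharges": 0.01183,
    "TotalCharges": 0.000642,
}


def churn_probability(customer):
    """Logistic transform of the linear predictor; omitted predictors count as 0."""
    eta = COEF["Intercept"] + sum(COEF[name] * value for name, value in customer.items())
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical customer: Term of 12, month-to-month, no technical support
base = {"Term": 12, "MonthlyCharges": 70.0, "TotalCharges": 840.0}
# Same hypothetical customer on a two-year contract
with_two_year = dict(base, TwoYear=1)

p_base = churn_probability(base)
p_two_year = churn_probability(with_two_year)
print(round(p_base, 3), round(p_two_year, 3))
```

Comparing the two probabilities shows how a single coefficient, here the two-year contract term, shifts the prediction for an otherwise identical customer.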
Potential faults
The stepwise model might be harder to explain to non-statisticians
If someone does not understand how to implement a model, it is useless to them
Questions?